
Lucas Moraes

I am a data specialist who works in both data science and analytics. Working in very different environments has given me strong communication skills with both technical and non-technical audiences, an asset I am valued for alongside my creativity. As a geneticist and bioinformatician by training, I have been using data science tools for over a decade in academic, governmental and industry projects. Although I am not currently working in the natural sciences or health sector, my goal is to build a career there. Such jobs are uncommon in Brazil, where the job market is heavily tilted towards fintech and retail. Even so, I consider myself a life sciences person before anything else, and always will.


Relevant professional experience

Data Scientist

Carrefour


Present - 2023

  • Surplus supply modeling for promotional events across the country
  • Scalable experiment design and modeling using PySpark and Google Cloud Platform

Senior Data Analyst

PicPay


2022 - 2021

  • Data analysis and statistical support for the behavioral segmentation model of customers (Gaussian Mixture Models).
  • Data analysis, experimental design, data wrangling and experiment monitoring for A/B testing of features in the suggestions section of the app home page.
  • Data analysis for integrity checks of models in production. Machine learning modeling for client propensity studies.
  • Analytical pipelines using GitHub + Databricks + Airflow for the creation of custom on-demand tables in the data lake.

Environmental Analyst

National Centre for Flora Conservation (CNCFlora)


2018 - 2013

  • Data analyses of structured and unstructured real-world data for scientific reports guiding government public policies on Brazil’s biodiversity conservation.
  • Scientific reports with project milestones, results, advertising topics and statistical analyses aimed at different stakeholders.









Technical stack

R

Python

SQL

Spark

Statistics

Machine Learning

Data Viz

Fluent English

Soft Skills

Education

MSc, Genetics

Rio de Janeiro Federal University


2018 - 2016

  • Hierarchical clustering (unsupervised machine learning) and dendrogram analyses of evolutionarily distinct lineages through the integration of biological, geographical and molecular unstructured data.
  • Advisor: Carlos Guerra Schrago.

BSc, Genetics

Rio de Janeiro Federal University


2012 - 2007

  • Phylogenetic and topological estimation of cetaceans using Bayesian and maximum likelihood methods for hierarchical clustering.
  • Publication: Phylogenetic Status and Timescale for the Diversification of Steno and Sotalia Dolphins. PLOS ONE. https://doi.org/10.1371/journal.pone.0028297
  • Advisor: Carlos Guerra Schrago.

About me

Skills & Stack


  • My expertise is in using statistics (machine learning included) to describe and understand client behavior, developing predictive models (e.g. churn or propensity) and segmenting clients with unsupervised ML. I have experience with both ad hoc analyses and production-ready models.
  • Statistical modeling, data analysis and hypothesis testing are considered staple knowledge in the field I graduated in. This has allowed me to move seamlessly between different technical areas (such as machine learning, A/B testing and data visualization).
  • R is my language of choice, but I am also proficient in Python, SQL and Spark (PySpark/sparklyr). I work comfortably with Git, GitHub and Linux, and I also have experience with the Databricks environment (and with notebooks in general) and AWS (e.g. Redshift and S3).
  • I have worked on several research projects with people from a wide array of backgrounds and seniority levels, some of them end to end. Since my undergraduate years, I have been well trained to explain technical subjects to non-technical audiences.